-
Cluster randomized trials (CRTs) are commonly used to evaluate the causal effects of educational interventions, where entire clusters (e.g., schools) are randomly assigned to treatment or control conditions. This study introduces statistical methods for designing and analyzing two-level (e.g., students nested within schools) and three-level (e.g., students nested within classrooms nested within schools) CRTs. Specifically, we use hierarchical linear models (HLMs) to account for the dependency among intervention participants within the same clusters and to estimate the average treatment effects (ATEs) of educational interventions as well as other effects of interest (e.g., moderator and mediator effects). We demonstrate methods and tools for sample size planning and statistical power analysis. Additionally, we discuss common challenges and potential solutions in the design and analysis phases, including the effects of omitting a level of clustering, non-compliance, threats to external validity, and the cost-effectiveness of the intervention. We conclude with practical suggestions for CRT design and analysis, along with recommendations for further reading.
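A minimal sketch of the kind of two-level model the study describes, fitted with a random school intercept; the data file and column names (y, treat, school) are illustrative assumptions, not the authors' materials.

```python
# Two-level HLM for a cluster randomized trial: schools are assigned to
# condition, students are nested within schools. Illustrative sketch only.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("crt_data.csv")  # hypothetical file: one row per student

# The random intercept for school accounts for within-school dependency;
# the coefficient on `treat` (a school-level indicator) estimates the ATE.
model = smf.mixedlm("y ~ treat", data=df, groups=df["school"])
result = model.fit(reml=True)
print(result.summary())
```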
-
Cost-effectiveness analysis studies in education often prioritize descriptive statistics of cost-effectiveness measures, such as the point estimate of the incremental cost-effectiveness ratio (ICER), while neglecting inferential statistics such as confidence intervals (CIs). Without CIs, meaningful comparisons of alternative educational strategies are impossible, as there is no basis for assessing the uncertainty of point estimates or the plausible range of ICERs. This study evaluates the relative performance of five methods of constructing CIs for ICERs in randomized controlled trials with cost-effectiveness analyses. We found that the Monte Carlo interval method based on summary statistics consistently performed well regarding coverage, width, and symmetry, yielding estimates comparable to the percentile bootstrap method across multiple scenarios. In contrast, Fieller's method did not perform well with small sample sizes and small treatment effects, and Taylor's method and the Box method performed least well. We discussed two-sided and one-sided hypothesis testing based on ICER CIs, developed tools for calculating these CIs, and demonstrated the calculation using an empirical example. We concluded with suggestions for applications and extensions of this work.
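A minimal sketch of the Monte Carlo interval idea for an ICER based on summary statistics; the numeric inputs, covariance assumption, and draw count are illustrative, not the paper's tool or results.

```python
# ICER point estimate plus a Monte Carlo percentile interval built from
# summary statistics (mean differences, SEs, and their covariance).
import numpy as np

rng = np.random.default_rng(2024)

d_cost, se_cost = 150.0, 40.0    # incremental cost and its SE (assumed)
d_eff, se_eff = 0.25, 0.08       # incremental effect and its SE (assumed)
cov_ce = 0.5 * se_cost * se_eff  # assumed cost-effect covariance

icer_point = d_cost / d_eff      # point estimate of the ICER

# Draw (cost, effect) differences from a bivariate normal and form the ratio.
# The ratio is well behaved only when the effect difference is clearly away
# from zero relative to its SE.
cov = [[se_cost**2, cov_ce], [cov_ce, se_eff**2]]
draws = rng.multivariate_normal([d_cost, d_eff], cov, size=10_000)
icers = draws[:, 0] / draws[:, 1]

ci_low, ci_high = np.percentile(icers, [2.5, 97.5])
print(f"ICER = {icer_point:.1f}, 95% CI ({ci_low:.1f}, {ci_high:.1f})")
```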
-
This study introduces recent advances in statistical power analysis methods and tools for designing and analyzing randomized cost-effectiveness trials (RCETs) to evaluate the causal effects and costs of social work interventions. The article focuses on two-level designs in which, for example, students are nested within schools, with the intervention applied either at the school level (cluster design) or at the student level (multisite design). We explore three statistical modeling strategies (random-effects, constant-effects, and fixed-effects models) to assess the cost-effectiveness of interventions, and we develop corresponding power analysis methods and tools. Power is influenced by the effect size, sample sizes, and design parameters. We developed a user-friendly tool, PowerUp!-CEA, to aid researchers in planning RCETs. When designing RCETs, it is crucial to consider the cost variance, its nested structure, and the covariance between effectiveness and cost data, as neglecting these factors may lead to underestimated power.
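For orientation, a minimal power sketch for a balanced two-level cluster design using the standard noncentral-t approximation for the effectiveness outcome alone; it does not model the cost variance or the cost-effect covariance that PowerUp!-CEA addresses, and all inputs are assumed values.

```python
# Approximate power for a balanced two-level CRT without covariates.
import numpy as np
from scipy.stats import nct, t

delta = 0.30   # standardized effect size (assumed)
J, n = 40, 25  # number of schools and students per school (assumed)
rho = 0.15     # intraclass correlation (assumed)
P = 0.5        # proportion of schools treated
alpha = 0.05

# Noncentrality parameter and degrees of freedom for the treatment test.
lam = delta * np.sqrt(P * (1 - P) * J * n / (1 + (n - 1) * rho))
df = J - 2
t_crit = t.ppf(1 - alpha / 2, df)
power = 1 - nct.cdf(t_crit, df, lam) + nct.cdf(-t_crit, df, lam)
print(f"Approximate power: {power:.3f}")
```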
-
Multilevel regression discontinuity designs have been increasingly used in education research to evaluate the effectiveness of policies and programs. It is common to ignore a level of nesting in a three-level data structure (students nested in classrooms/teachers nested in schools), whether unwittingly during data analysis or due to resource constraints during the planning phase. This study investigates the consequences of ignoring the intermediate or top level in blocked three-level regression discontinuity designs (BIRD3; treatment is at level 1) during data analysis and planning. Monte Carlo simulation results indicated that ignoring a level during analysis did not affect the accuracy of treatment effect estimates; however, it affected their precision (standard errors, power, and Type I error rates). Ignoring the intermediate level did not cause serious problems: power rates were slightly underestimated, whereas Type I error rates were stable. In contrast, ignoring the top level resulted in overestimated power rates, and severe inflation of Type I error rates rendered this strategy ineffective. As for the design phase, when the intermediate level was ignored, it was viable to use parameters from a two-level blocked regression discontinuity model (BIRD2) to plan a BIRD3 design; however, level-2 parameters from the BIRD2 model should be substituted for the level-3 parameters in the BIRD3 design. When the top level was ignored, using parameters from the BIRD2 model to plan a BIRD3 design should be avoided.
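A minimal sketch of the two fits being contrasted: a blocked three-level model (students in classrooms in schools, treatment assigned at level 1 by a scored cutoff) versus a two-level fit that ignores the classroom level. This is not the study's simulation code; the data file and column names are assumptions, and `score` is taken to be centered at the cutoff.

```python
# Compare a three-level blocked RDD fit with one that drops the classroom level.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("rdd_data.csv")  # hypothetical file: one row per student

# Three-level model: random intercepts for schools (groups) and for
# classrooms nested within schools (variance component).
m3 = smf.mixedlm(
    "y ~ treat + score + treat:score",
    data=df,
    groups=df["school"],
    re_formula="1",
    vc_formula={"classroom": "0 + C(classroom)"},
).fit()

# Two-level model that ignores the classroom level entirely.
m2 = smf.mixedlm("y ~ treat + score + treat:score",
                 data=df, groups=df["school"]).fit()

# Point estimates are expected to be similar; standard errors can differ.
print(m3.params["treat"], m3.bse["treat"])
print(m2.params["treat"], m2.bse["treat"])
```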
-
Extant literature on moderation effects focuses narrowly on the average moderated treatment effect across the entire sample (AMTE). Missing are the average moderated treatment effect on the treated (AMTT) and on other targeted subgroups (AMTS). Much like the average treatment effect on the treated (ATT) for main effects, the AMTS shifts the target of inference from the entire sample to targeted subgroups. Relative to the AMTE, the AMTS is identified under weaker assumptions and often captures more policy-relevant effects. We present a theoretical framework that introduces the AMTS under the potential outcomes framework and delineates the assumptions required for causal identification. We then propose a generalized propensity score method to estimate the AMTS using weights derived with Bayes' theorem. We illustrate the results and the differences among the estimands using data from the Early Childhood Longitudinal Study. We conclude with suggestions for future research.
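A minimal sketch of the odds-weighting idea that shifts inference toward the treated units within a moderator-defined subgroup; it is an ATT-style illustration, not the paper's AMTS estimator, and the data file, covariates, and column names are assumptions.

```python
# Propensity-score odds weighting toward the treated, evaluated by subgroup.
import pandas as pd
from sklearn.linear_model import LogisticRegression

df = pd.read_csv("eclsk_subset.csv")  # hypothetical extract
X = df[["ses", "pretest", "age"]]     # assumed covariates
ps = LogisticRegression(max_iter=1000).fit(X, df["treat"]).predict_proba(X)[:, 1]

# Odds weights: treated units keep weight 1, control units are re-weighted
# by e(x) / (1 - e(x)) so that they resemble the treated group.
df["w"] = df["treat"] + (1 - df["treat"]) * ps / (1 - ps)

def weighted_gap(d):
    """Weighted treatment-control difference in mean outcome."""
    t = d[d["treat"] == 1]
    c = d[d["treat"] == 0]
    return ((t["y"] * t["w"]).sum() / t["w"].sum()
            - (c["y"] * c["w"]).sum() / c["w"].sum())

# Effect among the treated within each moderator-defined subgroup.
print(weighted_gap(df[df["moderator"] == 1]))
print(weighted_gap(df[df["moderator"] == 0]))
```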
-
We used the generalized propensity score method to estimate the differential effects of five Early Child Care and Education (ECCE) experiences (Prekindergarten, Head Start, Center-based Child Care, Home-based Child Care, and Parental Care) in reducing math and reading achievement gaps between boys and girls, Latinx and White students, and Black and White students. Findings revealed differential effects of ECCE in reducing gender and racial achievement gaps. However, results indicated that significant gender and racial gaps persist despite ECCE experiences and that these gaps widen throughout the elementary and middle school years.
